Abstract

We introduce the concept of "negative bubbles" as the mirror (but not necessarily exactly sym- 
metric) image of standard financial bubbles, in which positive feedback mechanisms may lead to 
transient accelerating price falls. To model these negative bubbles, we adapt the Johansen-Ledoit- 
Sornette (JLS) model of rational expectation bubbles with a hazard rate describing the collective 
buying pressure of noise traders. The price fall occurring during a transient negative bubble can be 
interpreted as an effective random down payment that rational agents accept to pay in the hope of 
profiting from the expected occurrence of a possible rally. We validate the model by showing that 
it has significant predictive power in identifying the times of major market rebounds. This result is 
obtained by using a general pattern recognition method that combines the information obtained at 
multiple times from a dynamical calibration of the JLS model. Error diagrams, Bayesian inference 
and trading strategies suggest that one can extract genuine information and obtain real skill from 
the calibration of negative bubbles with the JLS model. We conclude that negative bubbles are in 
general predictably associated with large rebounds or rallies, which are the mirror images of the 
crashes terminating standard bubbles. 
I. INTRODUCTION

Financial bubbles are generally defined as transient upward accelerations of prices above
fundamental value [30, 43]. However, identifying unambiguously the presence of a bubble
remains an unsolved problem in standard econometric and financial economic approaches [16],
because the fundamental value is in general poorly constrained and it is not possible to
distinguish between an exponentially growing fundamental price and an exponentially
growing bubble price.

To break this stalemate, Sornette and co-workers have proposed that bubbles are actually
not characterized by exponential prices (sometimes referred to as "explosive"), but rather
by faster-than-exponential growth of price (that should therefore be referred to as
"super-explosive"). See [43] and references therein. The reason for such faster-than-exponential
regimes is that imitation and herding behavior of noise traders and of boundedly rational
agents create positive feedback in the valuation of assets, resulting in price processes that
exhibit a finite-time singularity at some future time $t_c$. See [15] for a general theory of
finite-time singularities in ordinary differential equations, [17] for a classification and
[26, 40, 45] for applications. This critical time $t_c$ is interpreted as the end of the bubble,
which is often but not necessarily the time when a crash occurs [?]. Thus, the main difference
with standard bubble models is that the underlying price process is considered to be
intrinsically transient due to positive feedback mechanisms that create an unsustainable regime.
Furthermore, the tension and competition between the value investors and the noise traders may
create deviations around the finite-time singular growth in the form of oscillations that are
periodic in the logarithm of the time to $t_c$. Log-periodic oscillations appear to our clocks as
peaks and valleys with progressively greater frequencies that eventually reach a point of no
return, where the unsustainable growth has the highest probability of ending in a violent crash
or gentle deflation of the bubble. Log-periodic oscillations are associated with the symmetry of
discrete scale invariance, a partial breaking of the symmetry of continuous scale invariance,
and occur in complex systems characterized by a hierarchy of scales. See [42] for a general
review and references therein.

The recent literature on bubbles and crashes falls into two main strands. In the first,
the combined effects of heterogeneous beliefs and short-sales constraints may cause large
movements in asset prices. In this kind of model, asset prices are determined at equilibrium
to the extent that they reflect the heterogeneous beliefs about payoffs. But short-sales
restrictions force the pessimistic investors out of the market, leaving only optimistic investors
and thus inflated asset price levels. However, when short-sales restrictions no longer bind
investors, prices fall back down [20, 32, 35, 41]. In the second strand, the role of
"noise traders" in fostering positive feedback trading is emphasized. These models say that
trend chasing by one class of agents produces momentum in stock prices [18]. The empirical
evidence on momentum strategies can be found in [21, 22].



Turning from bubbles and crashes to rebounds, the theoretical literature offers several
competing explanations for price decreases followed by reversals, notably liquidity and
time-varying risk. [39] stresses the importance of liquidity: as more people sell, agents who
borrowed money to buy assets are forced to sell too. When forced selling stops, this trend
reverses. [?] shows that it is risky to be a fundamental trader in this environment and that
price reversals after declines are likely to be higher when there is more risk in the price, as
measured by volatility. On the empirical front concerning the forecast of reversals in price
drops, [21] shows that the simplest way to predict prices is to look at past performance, and
[5] shows that price-dividend ratios forecast future returns for the market as a whole. However,
these two approaches do not aim at predicting, and cannot determine, the most probable rebound
time for a single stock ticker. The innovation of our methodology in this respect is to provide
a detailed method to detect the rebound of any given ticker.

In this paper, we explore the hypothesis that financial bubbles have mirror images in the
form of "negative bubbles" in which positive feedback mechanisms may lead to transient
accelerating price falls. We adapt the Johansen-Ledoit-Sornette (JLS) model of rational
expectation bubbles [25, 29] to negative bubbles. The crash hazard rate becomes the
rally hazard rate, which quantifies the probability per unit time that the market rebounds
in a strong rally. The upward accelerating bullish price characterizing a bubble, which
was the return that rational investors require as a remuneration for being exposed to crash
risk, becomes a downward accelerating bearish price of the negative bubble, which can be
interpreted as the cost that rational agents accept to pay to profit from a possible future
rally. During this accelerating downward trend, a tiny reversal could be a strong signal for
all the investors who are seeking to profit from the possible future rally. These investors
will go long the stock immediately after this tiny reversal. As a consequence, the price
rebounds very rapidly.

This paper contributes to the literature by augmenting the evidence for transient pockets
of predictability that are characterized by faster-than-exponential growth or decay. This
is done by adding the phenomenology and modeling of "negative bubbles" to the evidence
for characteristic signatures of (positive) bubbles. Both positive and negative bubbles are
suggested to result from the same fundamental mechanisms, involving imitation and herding
behavior which create positive feedbacks. By such a generalization within the same
theoretical framework, we hope to contribute to the development of a genuine science of
bubbles.

The rest of the paper is organized as follows. Section 2.1 summarizes the main definitions
and properties of the Johansen-Ledoit-Sornette (JLS) model for (positive) bubbles and their
associated crashes. Section 2.2 presents the modified JLS model for negative bubbles and
their associated rebounds (or rallies). The subsequent sections test the JLS model for negative
bubbles by providing different validation steps, in terms of prediction skills of actual
rebounds and of abnormal returns of trading strategies derived from the model. Section 3
describes the method we have developed to test whether the adapted JLS model for negative
bubbles has indeed skills in forecasting large rebounds. This method uses a robust pattern
recognition framework built on the information obtained from the calibration of the adapted
JLS model to the financial prices. Section 4 presents the results of the tests concerning the
performance of the method of section 3 with respect to the advance diagnostic of large
rebounds. Section 5 develops simple trading strategies based on the method of section 3,
which are shown to exhibit statistically significant returns when compared with random
strategies without skills but with otherwise comparable attributes. Section 6 concludes.



II. THEORETICAL MODEL FOR DETECTING REBOUNDS 



A. Introduction to the JLS model and bubble conditions 



[25, 29, 24] developed a model (referred to below as the JLS model) of financial bubbles
and crashes, which is an extension of the rational expectation bubble model of [?]. In
this model, a crash is seen as an event potentially terminating the run-up of a bubble. A
financial bubble is modeled as a regime of accelerating (super-exponential power law) growth
punctuated by short-lived corrections organized according to the symmetry of discrete scale
invariance [42]. The super-exponential power law is argued to result from positive feedback
resulting from noise trader decisions that tend to enhance deviations from fundamental
valuation in an accelerating spiral.

In the JLS model, the dynamics of stock markets is described as

$$ \frac{dp}{p} = \mu(t)\,dt + \sigma(t)\,dW - \kappa\,dj , \tag{1} $$

where $p$ is the stock market price, $\mu$ is the drift (or trend) and $dW$ is the increment of a
Wiener process (with zero mean and unit variance). The term $dj$ represents a discontinuous
jump such that $dj = 0$ before the crash and $dj = 1$ after the crash occurs. The loss amplitude
associated with the occurrence of a crash is determined by the parameter $\kappa$. The assumption
of the constant jump size is easily relaxed by considering a distribution of jump sizes, with
the condition that its first moment exists. Then, the no-arbitrage condition is expressed
similarly with $\kappa$ replaced by its mean. Each successive crash corresponds to a jump of $dj$
by one unit. The dynamics of the jumps is governed by a crash hazard rate $h(t)$. Since $h(t)dt$
is the probability that the crash occurs between $t$ and $t + dt$ conditional on the fact that it
has not yet happened, we have $E_t[dj] = 1 \times h(t)dt + 0 \times (1 - h(t)dt)$ and therefore

$$ E_t[dj] = h(t)\,dt . \tag{2} $$

Under the assumption of the JLS model, noise traders exhibit collective herding behaviors 
that may destabilize the market. The JLS model assumes that the aggregate effect of noise 
traders can be accounted for by the following dynamics of the crash hazard rate 

$$ h(t) = B'(t_c - t)^{m-1} + C'(t_c - t)^{m-1} \cos(\omega \ln(t_c - t) - \phi') . \tag{3} $$

The intuition behind this specification (3) has been presented at length in [24], [25], [29],
among others, and further developed in (Sornette and Johansen, 2002) for the power law part and
by [19] and (Zhou et al., 2005) for the second term in the right-hand side of expression
(3). In a nutshell, the power law behavior $\sim (t_c - t)^{m-1}$ embodies the mechanism of positive
feedback posited to be at the source of the bubbles. If the exponent $m < 1$, the crash hazard
may diverge as $t$ approaches a critical time $t_c$, corresponding to the end of the bubble. The
cosine term in the r.h.s. of (3) takes into account the existence of a possible hierarchical
cascade of panic acceleration punctuating the course of the bubble, resulting either from a
preexisting hierarchy in noise trader sizes [44] and/or from the interplay between market
price impact inertia and nonlinear fundamental value investing [19].

The no-arbitrage condition reads $E_t[dp] = 0$, where the expectation is performed with
respect to the risk-neutral measure, and in the frame of the risk-free rate. This is the
standard condition that the price process is a martingale. Taking the expectation of expression
(1) under the filtration (or history) until time $t$ reads

$$ E_t[dp] = \mu(t)p(t)\,dt + \sigma(t)p(t)\,E_t[dW] - \kappa p(t)\,E_t[dj] . \tag{4} $$

Since $E_t[dW] = 0$ and $E_t[dj] = h(t)dt$ (equation (2)), together with the no-arbitrage
condition $E_t[dp] = 0$, this yields

$$ \mu(t) = \kappa h(t) . \tag{5} $$

This result (5) expresses that the return $\mu(t)$ is controlled by the risk of the crash quantified
by its crash hazard rate $h(t)$.

Now, conditioned on the fact that no crash occurs, equation (1) is simply

$$ \frac{dp}{p} = \mu(t)\,dt + \sigma(t)\,dW = \kappa h(t)\,dt + \sigma(t)\,dW . \tag{6} $$

Its conditional expectation leads to

$$ E_t\!\left[\frac{dp}{p}\right] = \kappa h(t)\,dt . \tag{7} $$

Substituting with the expression (3) for $h(t)$ and integrating yields the so-called log-periodic
power law (LPPL) equation:



$$ \ln E[p(t)] = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) - \phi) , \tag{8} $$

where $B = -\kappa B'/m$ and $C = -\kappa C'/\sqrt{m^2 + \omega^2}$. Note that this expression (8)
describes the average price dynamics only up to the end of the bubble. The JLS model does not
specify what happens beyond $t_c$. This critical time is the termination of the bubble regime
and the transition time to another regime. This regime could be a big crash or a change of
the growth rate of the market. The Merrill Lynch EMU (European Monetary Union) Corporates
Non-Financial Index in 2009 [46] provides a vivid example of a change of regime characterized
by a change of growth rate rather than by a crash or rebound. For $m < 1$, the crash hazard
rate accelerates up to $t_c$, but its integral up to $t$, which controls the total probability for a
crash to occur up to $t$, remains finite and less than 1 for all times $t < t_c$. It is this property
that makes it rational for investors to remain invested knowing that a bubble is developing
and that a crash is looming. Indeed, there is still a finite probability that no crash will occur
during the lifetime of the bubble. The excess return $\mu(t) = \kappa h(t)$ is the remuneration that
investors require to remain invested in the bubbly asset, which is exposed to a crash risk.
The condition that the price remains finite at all times, including $t_c$, imposes that $m > 0$.
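
The coefficients $B$ and $C$ quoted above can be checked by integrating the hazard rate directly. The following is only a sketch: it assumes that (7) may be read as $d \ln E[p(t)] = \kappa h(t)\,dt$, and it introduces the shorthand $\tau = t_c - t$, which does not appear in the original text. The integration constant is absorbed into $A$.

```latex
\begin{aligned}
\ln E[p(t)] &= A + \kappa \int h(t)\,dt ,
  \qquad \tau \equiv t_c - t , \quad d\tau = -dt ,\\
\kappa \int B'\,\tau^{m-1}\,dt
  &= -\frac{\kappa B'}{m}\,\tau^{m} = B\,\tau^{m} ,\\
\kappa \int C'\,\tau^{m-1}\cos(\omega\ln\tau - \phi')\,dt
  &= -\kappa C'\,\mathrm{Re}\!\left[\frac{\tau^{m+i\omega}\,e^{-i\phi'}}{m+i\omega}\right]
   = -\frac{\kappa C'}{\sqrt{m^{2}+\omega^{2}}}\,\tau^{m}\cos(\omega\ln\tau - \phi) ,\\
\phi &= \phi' + \arctan(\omega/m) .
\end{aligned}
```

The phase shift $\arctan(\omega/m)$ is the argument of $m + i\omega$ for $m > 0$; it explains why the phase in (8) is written without the prime used in (3).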

Within the JLS framework, a bubble is qualified when the crash hazard rate accelerates.
According to (3), this imposes $m < 1$ and $B' > 0$, hence $B < 0$ since $m > 0$ by the condition
that the price remains finite. We thus have a first condition for a bubble to occur:

$$ 0 < m < 1 . \tag{9} $$

By definition, the crash rate should be non-negative. This imposes [48]

$$ b \equiv -Bm - |C|\sqrt{m^2 + \omega^2} \ge 0 . \tag{10} $$



B. Modified JLS model for "negative bubbles" and rebounds 

As recalled above, in the JLS framework, financial bubbles are defined as transient regimes 
of faster-than-exponential price growth resulting from positive feedbacks. We refer to these 
regimes as "positive bubbles." We propose that positive feedbacks leading to increasing 
amplitude of the price momentum can also occur in a downward price regime and that 
transient regimes of faster-than-exponential downward acceleration can exist. We refer to 
these regimes as "negative bubbles." In a "positive" bubble regime, the larger the price is, 
the larger the increase of future price. In a "negative bubble" regime, the smaller the price, 
the larger is the decrease of future price. In a positive bubble, the positive feedback results 
from over-optimistic expectations of future returns leading to self-fulfilling but transient 
unsustainable price appreciations. In a negative bubble, the positive feedbacks reflect the 
rampant pessimism fueled by short positions leading investors to run away from the market 
which spirals downwards also in a self-fulfilling process. 

The symmetry between positive and negative bubbles is obvious for currencies. If a 
currency A appreciates abnormally against another currency B following a faster-than- 
exponential trajectory, the value of currency B expressed in currency A will correspondingly 
fall faster-than-exponentially in a downward spiral. In this example, the negative bubble is 
simply obtained by taking the inverse of the price, since the value of currency A in units
of B is the inverse of the value of currency B in units of A. Using logarithm of prices, this 
corresponds to a change of sign, hence the "mirror" effect mentioned above. 

The JLS model provides a suitable framework to describe negative bubbles, with the only
modifications that both the expected excess return $\mu(t)$ and the crash amplitude $\kappa$ become
negative (hence the term "negative" bubble). Thus, $\mu$ becomes the expected (negative)
return (i.e., loss) that investors accept to bear, given that they anticipate a potential rebound
or rally of amplitude $|\kappa|$. Symmetrically to the case of positive bubbles, the price loss before
the potential rebound plays the role of a random payment that the investors honor in order
to remain invested and profit from the possible rally. The hazard rate $h(t)$ now describes
the probability per unit time for the rebound to occur. The fundamental equations (3) and
(8) then hold mutatis mutandis with the inequalities

$$ B > 0 , \qquad b < 0 \tag{11} $$

being the opposite to those corresponding to a positive bubble as described in the preceding
subsection.
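
The bubble conditions (9), (10) and (11) can be applied to a calibrated parameter set with a few lines of code. The sketch below is illustrative only: the function names `hazard_b` and `classify_fit` are hypothetical, and it assumes that the exponent bound $0 < m < 1$ of (9) is also required for negative bubbles.

```python
import math


def hazard_b(B, C, m, omega):
    """Quantity b of Eq. (10): b = -B*m - |C|*sqrt(m^2 + omega^2)."""
    return -B * m - abs(C) * math.sqrt(m * m + omega * omega)


def classify_fit(B, C, m, omega):
    """Label a fit 'positive', 'negative' or None using conditions (9)-(11).

    Assumption: 0 < m < 1 is imposed for both kinds of bubble.
    """
    if not (0.0 < m < 1.0):
        return None
    b = hazard_b(B, C, m, omega)
    if B < 0 and b >= 0:      # positive bubble: B < 0 and Eq. (10)
        return "positive"
    if B > 0 and b < 0:       # negative bubble: Eq. (11)
        return "negative"
    return None
```

Note that whenever $B > 0$ and $m > 0$, both terms of $b$ are non-positive, so in this reading the sign of $B$ effectively decides whether a fit is labeled as a negative bubble.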

An example of the calibration of a negative bubble with the JLS model (8) to the S&P
500 index from 1973-01-01 to 1974-10-01 is shown in the upper panel of Figure 1. During this
period, the S&P 500 index decreased at an accelerating pace. This price fall was accompanied
by very clear oscillations that are log-periodic in time, as described by the cosine term in
formula (8). Notice that the end of the decreasing market is followed by a dramatic rebound
in index price. We hypothesize that, similar to a crash following an unsustainable
super-exponential price appreciation (a positive bubble), an accelerating downward price
trajectory (a negative bubble) is in general followed by a strong rebound. Furthermore, in order
to suggest that this phenomenon is not isolated but actually happens widely
in all kinds of markets, another example in the foreign exchange market is presented in
the lower panel of Figure 1. The USD/EUR exchange rate from 2006-07-01 to 2008-04-01
also underwent a significant drawdown with very clear log-periodic oscillations, followed 
by a strong positive rebound. One of the goals of this paper is to identify such regions 
of negative bubbles in financial time series and then use a pattern recognition method to 
distinguish ones that were (in a back-testing framework) followed by significant price rises. 

In financial markets, large positive returns are less frequent than large negative returns,
as expressed for instance in the skewness of return distributions. However, when studying
drawdowns and drawups (i.e., runs of same sign returns), Johansen and Sornette found that,
for individual companies, there are approximately twice as many large rallies as crashes with
amplitude larger than 20% with durations of a few days.



III. REBOUND PREDICTION METHOD 

We adapt the pattern recognition method of [13] to generate predictions of rebound times
in financial markets on the basis of the detection and calibration of negative bubbles, defined
in the previous section. We analyze the S&P 500 index prices, obtained from Yahoo! finance
for ticker ^GSPC (adjusted close price) [50]. The start time of our time series is 1950-01-05,
which is very close to the first day when the S&P 500 index became available (1950-01-03).
The last day of our tested time series is 2009-06-03.



A. Fitting methods 

We first divide our S&P 500 index time series into different sub-windows $[t_1, t_2]$ of length
$dt = t_2 - t_1$ according to the following rules:

1. The earliest start time of the windows is $t_1$ = 1950-01-03. Other start times $t_1$ are
calculated using a step size of $dt_1 = 50$ calendar days.

2. The latest end time of the windows is $t_2$ = 2009-06-03. Other end times $t_2$ are
calculated with a negative step size $dt_2 = -50$ calendar days.

3. The minimum window size is $dt_{\min} = 110$ calendar days.

4. The maximum window size is $dt_{\max} = 1500$ calendar days.

These rules lead to 11,662 windows in the S&P 500 time series.
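
One possible reading of the four rules above is sketched below. It is an assumption-laden illustration, not the authors' procedure: the function name `make_windows` is hypothetical, start and end dates are laid on two independent 50-day grids, and every (start, end) pair whose length lies between the minimum and maximum size is kept. The number of windows this reading produces need not match the 11,662 reported in the text, since details such as the handling of non-trading days are not specified.

```python
from datetime import date, timedelta


def make_windows(first=date(1950, 1, 3), last=date(2009, 6, 3),
                 step=50, min_len=110, max_len=1500):
    """Enumerate (t1, t2) windows from 50-day grids of start and end dates."""
    # Start times: step forward from the earliest date (rule 1).
    starts = []
    t = first
    while t < last:
        starts.append(t)
        t += timedelta(days=step)
    # End times: step backward from the latest date (rule 2).
    ends = []
    t = last
    while t > first:
        ends.append(t)
        t -= timedelta(days=step)
    # Keep pairs whose length respects the size bounds (rules 3 and 4).
    return [(t1, t2) for t1 in starts for t2 in ends
            if min_len <= (t2 - t1).days <= max_len]
```

Calling `make_windows()` returns a list of date pairs that can be fed, one window at a time, to the calibration step.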

For each window, the log of the S&P 500 index is fit with the JLS equation (8). The fit
is performed in two steps. First, the linear parameters $A$, $B$ and $C$ are slaved to the
nonlinear parameters by solving them analytically as a function of the nonlinear parameters.
We refer to [24] (page 238 and following ones), which gives the detailed equations and
procedure. Then, the search space is obtained as a 4-dimensional parameter space representing
$m$, $\omega$, $\phi$, $t_c$. A heuristic search implementing the Tabu algorithm [?] is used to find initial
estimates of the parameters, which are then passed to a Levenberg-Marquardt algorithm
[31, 34] to minimize the residuals (the sum of the squares of the differences) between the
model and the data. The bounds of the search space are:

$$ m \in [0.001, 0.999] \tag{12} $$

$$ \omega \in [0.01, 40] \tag{13} $$

$$ \phi \in [0.001, 2\pi] \tag{14} $$

$$ t_c \in [t_2, t_2 + 0.375(t_2 - t_1)] \tag{15} $$

We choose these bounds because $m$ has to be between 0 and 1 according to the discussion
before; the log-angular frequency $\omega$ should be greater than 0. The upper bound 40 is large
enough to catch high-frequency oscillations (though we later discard fits with $\omega > 20$); the phase
$\phi$ should be between 0 and $2\pi$; as we are predicting a critical time in financial markets, the
critical time should be after the end of the time series we are fitting. Finally, the upper bound
of the critical time should not be too far away from the end of the time series since predictive
capacity degrades far beyond $t_2$. We have empirically found elsewhere [23] one-third of the
interval width to be a good cut-off.
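
The first step of the fit, slaving the linear parameters to the nonlinear ones, can be illustrated as follows. This is a minimal sketch rather than the procedure of [24]: for fixed $(m, \omega, \phi, t_c)$, Eq. (8) is linear in $A$, $B$ and $C$, so they are obtained here from the ordinary least-squares normal equations. The helper names `lppl`, `solve3` and `slave_linear` are hypothetical, and neither the Tabu search nor the Levenberg-Marquardt refinement is implemented.

```python
import math


def lppl(t, A, B, C, m, omega, phi, tc):
    """Right-hand side of Eq. (8) at a single time t < tc."""
    dt = tc - t
    return A + B * dt**m + C * dt**m * math.cos(omega * math.log(dt) - phi)


def solve3(M, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def slave_linear(ts, ys, m, omega, phi, tc):
    """Least-squares A, B, C for fixed nonlinear parameters; returns (A, B, C, sse)."""
    rows = []
    for t in ts:
        f = (tc - t) ** m
        rows.append((1.0, f, f * math.cos(omega * math.log(tc - t) - phi)))
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    A, B, C = solve3(M, v)
    sse = sum((y - (A * r[0] + B * r[1] + C * r[2])) ** 2
              for r, y in zip(rows, ys))
    return A, B, C, sse
```

A nonlinear search then only has to explore the four-dimensional space bounded by (12)-(15), evaluating `slave_linear` at each candidate point and keeping the candidate with the smallest sum of squared residuals.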

The combination of the heuristic and optimization results in a set of parameters
$A, B, C, m, \omega, \phi$ and $t_c$ for each of the 11,662 windows. Of these parameter sets, 2,568 satisfy
the negative bubble condition (11). In Figure 2 we plot the histogram of critical time $t_c$
for these negative bubble fits and the negative logarithm of the S&P 500 time series. Peaks
in this time series, then, indicate minima of the prices, many of these peaks being preceded
by a fast acceleration with upward curvature indicating visually a faster-than-exponential
growth of $-p(t)$. This translates into accelerating downward prices. Notice that many of
these peaks of $-\ln p(t)$ are followed by sharp drops, that is, fast rebounds in the regular
$+\ln p(t)$. We see that peaks in $-\ln p(t)$ correspond to peaks in the negative bubble $t_c$
histogram, implying that the negative bubbles qualified by the JLS model are often followed by
rebounds. This suggests the possibility to diagnose negative bubbles and their demise in the
form of a rebound or rally. If correct, this hypothesis would extend the proposition [?],
that financial bubbles can be diagnosed before their end and their termination time can be
determined with an accuracy better than chance, to negative bubble regimes associated with
downward price regimes. We quantify this observation below.

B. Definition of rebound



The aim is first to recognize different patterns in the S&P 500 index from the 11,662 fits 
and then use the subset of 2,568 negative bubble fits to identify specific negative bubble 
characteristics. These characteristics will then be used to 'predict' (in a back-testing sense) 
negative bubbles and rebounds in the future. 

We first define a rebound, denoted Rbd. A day is a rebound Rbd if the price on that
day is the minimum price in a window of 200 days before and 200 days after it. That is,

$$ \mathrm{Rbd} = \{ d \mid P_d = \min\{P_x\},\ \forall x \in [d - 200, d + 200] \} \tag{16} $$

where $P_d$ is the adjusted closing price on day $d$. We find 19 rebounds of the ±200-days
type [51] in the 59-year S&P 500 index history. Our task is to diagnose such rebounds
in advance. We could also use other numbers instead of 200 to define a rebound. The
predictability is stable with respect to a change of this number. This is because we learn
from a learning set built with a certain type of rebound and try to predict rebounds
of the same type. Later we will also show the results for the ±365-days type of rebounds.
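
Definition (16) translates directly into a scan over the price series. The sketch below is illustrative and rests on two assumptions that the text does not settle: days are counted as positions in the price list, and days closer than 200 positions to either end of the series are skipped because their full window is not available. The function name `find_rebounds` is hypothetical.

```python
def find_rebounds(prices, half_window=200):
    """Indices d where prices[d] is the minimum over [d - half_window, d + half_window].

    Assumptions: days are list positions, and days without a complete
    window on both sides are not considered. Ties are all reported.
    """
    rebounds = []
    for d in range(half_window, len(prices) - half_window):
        window = prices[d - half_window: d + half_window + 1]
        if prices[d] == min(window):
            rebounds.append(d)
    return rebounds
```

Changing `half_window` to 365 gives the second type of rebound mentioned in the text.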



C. Definitions and concepts needed to set up the pattern recognition method 

In what follows we use the following hierarchy of descriptive and quantitative terms.

• learning set. A subset of the whole set which only contains the fits with critical 
times in the past. We learn the properties of historical rebounds from this set and 
develop the predictions based on these properties. 

• classes. Two classes of fits are defined according to whether the critical time of a 
given fit is near some rebound or not, where 'near' will be defined below. 

• groups. A given group contains all fits of a given window size. 

• informative parameters. Informative parameters are the distinguishing parameters 
of fits in the same group but different classes. 

• questionnaires. Based on the value of an informative parameter, one can ask if a
certain trading day is the start of a rebound or not. The answer series generated by all
the informative parameters is called a questionnaire.

• traits. Traits are extracted from a questionnaire. They are short and contain crucial
information and properties of a questionnaire.

• features. Traits showing the specific property of a single class are selected to be the
features of that class.

• rebound alarm index. An index developed from features to show the probability 
that a certain day is a rebound. 

In this paper, we will show how all the above objects are constructed. Our final goal is 
to make predictions for the rebound time. The development of the rebound alarm index 
will enable us to achieve our goal. Several methodologies are presented to quantify the 
performance of the predictions. 

D. Classes 

In the pattern recognition method of [13], one should define the learning set to find
characteristics that will then be used to make predictions. We designate all fits before Jan.
1, 1975 as the learning set $\Sigma_1$:

$$ \Sigma_1 = \{ f \mid t_{c,f},\, t_{2,f} < \text{Jan. 1, 1975} \} . \tag{17} $$

There are 4,591 fits in this set, all of which we use without any pre-selection. In particular,
no pre-selection using Eq. (11) is applied, on the basis of the robustness of the pattern
recognition method. We then distinguish two different classes from $\Sigma_1$ based on the critical
time $t_c$ of the fits. For a single fit $f$ with critical time $t_{c,f}$, if this critical time is within
$D$ days of a rebound, then we assign fit $f$ to Class I, represented by the symbol $C_I$. Otherwise,
$f$ is assigned to Class II, represented by the symbol $C_{II}$. For this study, we chose $D = 10$ days
because too large a value of $D$ loses precision, while too small a value makes the classification
sensitive to noise. In this case, Class I fits are those with $t_c$ within 10 days of one of the
19 rebounds. We formalize this rule as:

$$ C_I = \{ f \mid f \in \Sigma_1,\ \exists\, d \in \mathrm{Rbd} \ \text{s.t.}\ |t_{c,f} - d| \le D \} , \tag{18} $$

$$ C_{II} = \{ f \mid f \in \Sigma_1,\ |t_{c,f} - d| > D,\ \forall d \in \mathrm{Rbd} \} , \tag{19} $$

$$ D = 10 \ \text{days} . \tag{20} $$

To be clear, Class I is formed by all the fits in learning set $\Sigma_1$ which have a critical time $t_c$
within 10 days of one of the rebounds. All of the fits in the learning set which are not in
Class I are in Class II.
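
The class assignment can be sketched in a few lines. The names `assign_class` and `split_classes` are hypothetical, critical times and rebound days are represented as integer day numbers, and "within $D$ days" is taken to include a distance of exactly $D$ so that the two classes partition the learning set; these choices are assumptions of the example.

```python
def assign_class(tc, rebound_days, D=10):
    """Return 'I' if tc lies within D days of some rebound day, else 'II'."""
    return "I" if any(abs(tc - d) <= D for d in rebound_days) else "II"


def split_classes(fit_tcs, rebound_days, D=10):
    """Partition a list of critical times into Class I and Class II lists."""
    classes = {"I": [], "II": []}
    for tc in fit_tcs:
        classes[assign_class(tc, rebound_days, D)].append(tc)
    return classes
```

For instance, with rebounds on days 100 and 500, a fit with critical time 108 falls in Class I while a fit with critical time 300 falls in Class II.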



E. Groups 

We also categorize all fits into separate groups (in addition to the two classes defined
above) based on the length of the fit interval, $L_f = dt = t_2 - t_1$. We generate 14 groups,
where a given group $G_i$ is defined by:

$$ G_i = \{ f \mid L_f \in [100i, 100i + 100],\ i = 1, 2, \ldots, 14,\ f \in \Sigma_1 \} \tag{21} $$

All 4,591 fits in the learning set are placed into one of these 14 groups.
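
A small helper makes the grouping rule concrete. Because the closed intervals in (21) share their end points, the sketch below assumes half-open bins, with a window of exactly 1500 days placed in group 14; the function name `group_index` and this tie-breaking rule are assumptions of the example.

```python
def group_index(length_days):
    """Group number i in 1..14 for a window length in days, or None if out of range.

    Assumption: bins are [100*i, 100*i + 100), except that 1500 joins group 14.
    """
    if length_days < 100 or length_days > 1500:
        return None
    return min(int(length_days // 100), 14)
```

With the window sizes used in the fits (110 to 1500 calendar days), every fit receives a group number between 1 and 14.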



F. Informative Parameters 



For each fit in the learning set, we take 6 parameters to construct a flag that determines
the characteristics of classes. These 6 parameters are $m$, $\omega$, $\phi$ and $B$ from Eq. (8), $b$ (the
negative bubble condition) from Eq. (10) and $q$ as the residual of the fit.

We categorize these sets of 6 parameters for fits which are in the same group and same
class. Then for each class-group combination, we calculate the probability density function
(pdf) of each parameter using the adaptive kernel method [49], generating 168 pdfs (6
parameters x 2 classes x 14 groups).

We compare the similarity (defined below) of the pdfs of each of the six parameters that 
are in the same group (window length) but different classes (proximity of tc to a rebound 
date). If these two pdfs are similar, then we ignore this parameter in this group. If the 
pdfs are different, we record this parameter of this group as an informative parameter. The 
maximum number of possible informative parameters is 84 (6 parameters x 14 groups). 

We use the Kolmogorov-Smirnov method [?] to detect the difference between pdfs. If the
maximum difference of the cumulative distribution functions (integral of pdf) between two
classes exceeds 5%, then this is an informative parameter. We want to assign a uniquely
determined integer $IP_l$ to each informative parameter. We can do so by using three indexes,
$i$, $j$ and $l$. The index $i$ indicates which group, with $i \in [1, 14]$. The index $j$ indicates the
parameter, where $j = 1, 2, 3, 4, 5, 6$ refer to $m, \omega, \phi, B, b, q$, respectively. Finally, $l$ represents
the actual informative parameter. Assuming that there are $L$ informative parameters in
total and using the indexes, $IP_l$ is then calculated via

$$ IP_l = 6i + j \tag{22} $$

for $l \in [1, L]$.

Given the $L$ informative parameters $IP_l$, we consider the pdfs for the two different classes
of a single informative parameter. The set of abscissa values within the allowed range given
by equations (12)-(15), for which the pdf of Class I is larger than the pdf of Class II, defines
the domain $Rg_{I,l}$ ('good region') of this informative parameter which is associated with Class
I. The other values of the informative parameters for which the pdf of Class I is smaller than
the pdf of Class II define the domain $Rg_{II,l}$ which is associated with Class II. These regions
play a crucial role in the generation of questionnaires in the next section.

Our hypothesis is that many "positive" and "negative bubbles" share the same structure
described by the JLS model, because they result from the same underlying herding mechanism.
However, nothing a priori imposes that the control parameters should be identical.
Note that our pattern recognition methodology specifically extracts the typical informative
parameter ranges that characterize the "negative bubbles".
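
The selection of informative parameters can be illustrated with a two-sample Kolmogorov-Smirnov distance. The sketch below departs from the text in one labeled respect: it compares empirical cumulative distribution functions built directly from the samples, instead of integrating adaptive-kernel pdfs. The names `ks_statistic`, `PARAMS` and `informative_parameters`, as well as the dictionary layout of `samples`, are hypothetical; the integer label follows Eq. (22).

```python
import bisect

# Order of the six parameters, matching j = 1..6 in the text.
PARAMS = ("m", "omega", "phi", "B", "b", "q")


def ks_statistic(xs, ys):
    """Maximum absolute difference between the empirical CDFs of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def informative_parameters(samples, threshold=0.05):
    """Map the label 6*i + j of Eq. (22) to (group i, parameter name).

    samples[(i, name)] = (class_I_values, class_II_values); a parameter is
    kept when the KS distance between its two classes exceeds the threshold.
    """
    result = {}
    for (i, name), (class_1, class_2) in samples.items():
        if class_1 and class_2 and ks_statistic(class_1, class_2) > threshold:
            j = PARAMS.index(name) + 1
            result[6 * i + j] = (i, name)
    return result
```

Because $j$ runs from 1 to 6, the label $6i + j$ is distinct for every (group, parameter) pair, which is what makes the integer uniquely determined.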

G. Intermediate summary 

We realize that many new terms are being introduced, so in an attempt to be absolutely
clear, we briefly summarize the method to this point. We sub-divide a time series into many
windows $(t_1, t_2)$ of length $L_f = t_2 - t_1$. For each window, we obtain a set of parameters that
best fit the model (8). Each of these windows will be assigned one of two classes and one of
14 groups. Classes indicate how close the modeled critical time is to a historical rebound,
where Class I indicates 'close' and Class II indicates 'not close'. Groups indicate the length
of the window. For each fit, we create a set of six parameters: $m$, $\omega$, $\phi$ and $B$ from Eq. (8),
$b$ (the negative bubble condition) from Eq. (10) and $q$ as the residual of the fit. We create
the pdfs of each of these parameters for each class-group combination and define informative
parameters as those parameters for which the pdfs differ significantly according to a
Kolmogorov-Smirnov test. For each informative parameter, we find the regions of the abscissa of
the pdf for which the Class I pdf (fits with $t_c$ close to a rebound) is greater than the Class II
pdf. For informative parameter $l$ (defined in (22)), this region is designated as $Rg_{I,l}$. In the
next section, we will use these regions to create questionnaires that will be used to predictively
identify negative bubbles that will be followed by rebounds.

Another important distinction to remember at this point is that the above method has 
been used to find informative parameters that will be used below. Informative parameters 
are associated with a class and a group. 

H. Questionnaires 

Using the informative parameters and their pdfs described above, we can generate ques- 
tionnaires for each day of the learning or testing set. Questionnaires will be used to identify 
negative bubbles that will be followed by rebounds. The algorithm for generating question- 
naires is the following: 

1. Obtain the maximum ($t_{cmax}$) and minimum ($t_{cmin}$) values of $t_c$ from some subset $S_{sub}$,
either the 'learning' set or the 'predicting (testing)' set, of all 11,662 fits.

2. Scan each day $t_{scan}$ from $t_{cmin}$ to $t_{cmax}$. There will be $t_{cmax} - t_{cmin} + 1$ days to
scan. For each scan day, create a new set $S_{t_{scan}}$ consisting of all fits in subset $S_{sub}$
that have a $t_c$ near the scan day $t_{scan}$, where 'near' is defined using the same criterion
used for defining the two classes, namely $D = 10$ days:

$S_{t_{scan}} = \{ f \mid |t_{c,f} - t_{scan}| \le D, \ f \in S_{sub} \}$ (23)

The number $\#S_{t_{scan}}$ of fits in each set can be 0 or greater. The sum of the number of
fits found in all of the sets, $\sum_{t_{scan} = t_{cmin}}^{t_{cmax}} \#S_{t_{scan}}$, can actually be greater than the total
number of fits in $S_{sub}$ since some fits can be in multiple sets. Notice that the fits
in each set $S_{t_{scan}}$ can (and do) have varying window lengths. At this point, only the
proximity to a scan day is used to determine inclusion in a scan set.

3. Assign a group to each of the fits in $S_{t_{scan}}$. Recall that groups are defined in Eq. ( ETj)
and are based on the window length $L_f = dt = t_2 - t_1$.

4. Using all sets $S_{t_{scan}}$, for each informative parameter $IP_l$ found in Sec. III F, determine
if it belongs to Class I (close to a rebound) or Class II (not close to a rebound). There
are 3 possible answers: 1 = 'belongs to Class I', -1 = 'belongs to Class II' or 0 =
'undetermined'.
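
The construction of the scan sets in step 2 can be sketched as follows. The `Fit` record and the function name `scan_sets` are hypothetical, times are represented as integer day numbers for simplicity, and the inclusive bound $|t_{c,f} - t_{scan}| \le D$ is an assumption about the garbled inequality in Eq. (23).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fit:
    tc: int   # critical time of the fit, as a day number
    t1: int   # start of the fitting window
    t2: int   # end of the fitting window

def scan_sets(fits, D=10):
    """Map every scan day in [tc_min, tc_max] to the fits with |tc - day| <= D."""
    tcs = [f.tc for f in fits]
    return {day: [f for f in fits if abs(f.tc - day) <= D]
            for day in range(min(tcs), max(tcs) + 1)}
```

Because a fit belongs to every scan set within $D$ days of its critical time, the total count over all sets generally exceeds the number of fits, as noted in step 2.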



The status of 'belonging to Class I' or not is determined as follows. First, find all values
of the informative parameter $IP_l$ in a particular scan set $S_{t_{scan}}$. For instance, if for a
particular scan day $t_{scan}$ there are $n$ fits in the subset $S_{sub}$ that have $t_c$ 'near' $t_{scan}$, then
the set $S_{t_{scan}}$ contains those $n$ fits. These $n$ fits include windows of varying lengths so that
the windows themselves are likely associated with different groups. Now consider a given
informative parameter $IP_l$ and its underlying parameter $j$ (described in Sec. III F) that has
an associated 'good region', $R_{gI,l}$. Remember that this informative parameter $IP_l$ has an
associated group. Count the number $p$ of the $n$ fits whose lengths belong to the associated
group of $IP_l$. If more of the values of the underlying parameter of these $p$ fits lie within $R_{gI,l}$ than
outside of it, then $IP_l$ belongs to Class I and, thus, the 'answer' to the question of 'belonging
to Class I' is $a = 1$. If, on the other hand, more values lie outside the 'good region' $R_{gI,l}$
than in it, the answer is $a = -1$. If the same number of values are inside and outside of
$R_{gI,l}$, then $a = 0$. Also, if no members of $S_{t_{scan}}$ belong to the associated group of $IP_l$, then
$a = 0$.

To make this clearer, let us look at an example. Assume that
parameter $m$ in Group 3 is the informative
parameter $IP_{19}$ and that $m \in [A, B]$ is the 'good region' $R_{gI,19}$ of Class I. We consider a single
$t_{scan}$ and find that there are two fits in $S_{t_{scan}}$ in this group, with parameter $m$ values of $m_1$
and $m_2$. We determine the 'answer' $a = a_{IP_{19}}$ as follows:

• If $m_1, m_2 \in [A, B]$, we say that, based on $IP_{19}$ (Group 3, parameter $m$), fits near
$t_{scan}$ belong to Class I. Mark this answer as $a_{IP_{19}} = 1$.

• If $m_1 \in [A, B]$ and $m_2 \notin [A, B]$, we say that fits near $t_{scan}$ cannot be identified and so
$a_{IP_{19}} = 0$.

• If $m_1, m_2 \notin [A, B]$, fits near $t_{scan}$ belong to Class II and $a_{IP_{19}} = -1$.

More succinctly,

$$a_{IP_{19}} = \begin{cases} 1 & \text{if } m_1, m_2 \in [A, B] \\ 0 & \text{if } m_i \in [A, B], \ m_j \notin [A, B], \ i \ne j, \ i, j \in \{1, 2\} \\ -1 & \text{if } m_1, m_2 \notin [A, B] \end{cases}$$ (24)

For each of the informative parameters, we get an answer $a$ that says that fits near $t_{scan}$
belong to Class I or II (or cannot be determined). For a total of $L$ informative parameters,
we get a questionnaire $A$ of length $L$:

$A_{t_{scan}} = a_1 a_2 a_3 \ldots a_L, \quad a_i \in \{-1, 0, 1\}$ (25)

Qualitatively, these questionnaires describe our judgement as to whether $t_{scan}$ is a rebound or
not. This judgement depends on the observations of informative parameters.
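
The majority rule behind each answer, and the assembly of answers into a questionnaire, can be sketched as below. The names `answer` and `questionnaire` are hypothetical, and a good region is simplified here to a single closed interval, which is an assumption (in general it could be a union of intervals).

```python
def answer(values, good_region):
    """Return 1, -1 or 0 by majority vote of values inside/outside the good region.

    An empty list (no fit of the relevant group near the scan day) gives 0.
    """
    lo, hi = good_region
    inside = sum(lo <= v <= hi for v in values)
    outside = len(values) - inside
    if inside > outside:
        return 1
    if inside < outside:
        return -1
    return 0

def questionnaire(values_by_ip, good_regions):
    """List of answers, ordered by informative-parameter label."""
    return [answer(values_by_ip.get(ip, []), good_regions[ip])
            for ip in sorted(good_regions)]
```

With two values of $m$ as in the example above, `answer` reproduces the three cases of Eq. (24).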

I. Traits 

The concept of a trait is developed to describe the property of the questionnaire for each
$t_{scan}$. Each questionnaire can be decomposed into a fixed number of traits if the length of the
questionnaire is fixed.

From any questionnaire with length $L$, we generate a series of traits by the following
method. Every trait is a series of 4 to 6 integers, $\tau = \{p, q, r, (P, Q, R)\}$. The first three terms
$p$, $q$ and $r$ are simply integers. The term $(P, Q, R)$ represents a string of 1 to 3 integers. We
first describe $p$, $q$ and $r$ and then the $(P, Q, R)$ term.

The integers $p$, $q$ and $r$ have limits: $p \in \{1, 2, \ldots, L\}$, $q \in \{p, p+1, \ldots, L\}$, $r \in \{q, q+1, \ldots, L\}$.
We select all the possible combinations of bits from the questionnaire $A_{t_{scan}}$ with the
condition that each time the number of selected questions is at most 3. We record the numbers
of the selected positions and sort them. The terms $p$, $q$ and $r$ are selected position numbers
and defined as follows:

• If only one position $i_1$ is selected: $r = q = p = i_1$

• If two positions $i_1, i_2$ are selected: $p = i_1$, $r = q = i_2$ ($i_1 < i_2$)

• If three positions $i_1, i_2, i_3$ are selected: $p = i_1$, $q = i_2$, $r = i_3$ ($i_1 < i_2 < i_3$)

The term $(P, Q, R)$ is defined as follows:

$r = q = p$: $(P, Q, R) = a_p$ (26)
$r = q, \ q \ne p$: $(P, Q, R) = a_p, a_q$ (27)
$r \ne q, \ q \ne p$: $(P, Q, R) = a_p, a_q, a_r$ (28)

As an example, $A = (0, 1, -1, -1)$ has the traits listed in Table II.

For a questionnaire with length $L$, there are $3L + 3^2 \binom{L}{2} + 3^3 \binom{L}{3}$ possible traits. However,
a single questionnaire has only $L + \binom{L}{2} + \binom{L}{3}$ traits, because $(P, Q, R)$ is defined by $p$, $q$ and
$r$. In this example, there are 14 traits for questionnaire $(0, 1, -1, -1)$ and 174 total traits for
all possible $L = 4$ questionnaires.
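
The trait construction and the two counts quoted above can be checked with a short sketch. The function names `traits` and `n_possible_traits` are hypothetical, and a trait is represented here as a flat tuple $(p, q, r, a_p, \ldots)$.

```python
from itertools import combinations
from math import comb

def traits(questionnaire):
    """All traits (p, q, r, answers...) built from 1, 2 or 3 selected positions."""
    L = len(questionnaire)
    out = []
    for k in (1, 2, 3):
        for pos in combinations(range(1, L + 1), k):  # sorted, 1-based positions
            p = pos[0]
            q = pos[min(1, k - 1)]   # equals p when only one position is selected
            r = pos[-1]              # equals q when one or two positions are selected
            answers = tuple(questionnaire[i - 1] for i in pos)
            out.append((p, q, r) + answers)
    return out

def n_possible_traits(L):
    """Number of distinct traits over all questionnaires of length L."""
    return sum(3 ** k * comb(L, k) for k in (1, 2, 3))
```

For the questionnaire $(0, 1, -1, -1)$ this yields 14 traits, and `n_possible_traits(4)` gives 174, in agreement with the counts in the text.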

J. Features 

At the risk of being redundant, it is worth briefly summarizing again. Until now we
have: $L$ informative parameters $IP_1, IP_2, \ldots, IP_L$ from 84 different parameters (84 =
6 parameters × 14 groups) and a series of questionnaires $A_{t_{scan}}$ for each $t_{scan}$ from $t_{cmin}$ to
$t_{cmax}$ using set $S_{t_{scan}}$. These questionnaires depend upon which subset $S_{sub}$ of fits is chosen.
Each questionnaire has a sequence of traits that describe the property of this questionnaire
in a short and clear way. Now we generate features for both classes.

Recall that the subset of fits $S_{feature}$ that we use here is that which contains all fits
which have a critical time $t_c$ earlier than $t_p$ = 1975-01-01, $S_{feature} = \{ f \mid t_{c,f} < t_p \}$. By
imposing that $t_2$ and $t_{c,f}$ are both smaller than $t_p$, we do not use any future information.
Considering the boundary condition of critical times in Eq. (15), the end time $t_2$ of a given
fit is less than or equal to $t_c$. Additionally, we select only those critical times such that

$t_{c,f} < t_p, \ \forall f \in S_{feature}$.

Assume that there are two sets of traits $T_I$ and $T_{II}$ corresponding to Class I and Class
II, respectively. Scan day by day the date $t$ from the smallest $t_c$ in $S_{feature}$ until $t_p$. If $t$ is
near a rebound (using the same $D = 10$ day criterion as before), then all traits generated
by questionnaire $A_t$ belong to $T_I$. Otherwise, all traits generated by $A_t$ belong to $T_{II}$.

Count the frequencies of a single trait $\tau$ in $T_I$ and $T_{II}$. If $\tau$ is in $T_I$ for more than $\alpha$
times and in $T_{II}$ for less than $\beta$ times, then we call this trait $\tau$ a feature $F_I$ of Class I.
Similarly, if $\tau$ is in $T_I$ for less than $\alpha$ times and in $T_{II}$ for more than $\beta$ times, then we call
$\tau$ a feature $F_{II}$ of Class II. The pair $(\alpha, \beta)$ is defined as a feature qualification. We will vary
this qualification to optimize the back tests and predictions.
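
The frequency rule for selecting features can be sketched as below. The name `select_features` is hypothetical, traits are assumed to be hashable objects (for example tuples), and the strict inequalities follow the wording of the paragraph above.

```python
from collections import Counter

def select_features(traits_I, traits_II, alpha, beta):
    """Split traits into Class I and Class II features by their frequencies.

    traits_I / traits_II: all traits collected on days near / not near a rebound.
    """
    count_I, count_II = Counter(traits_I), Counter(traits_II)
    features_I, features_II = set(), set()
    for trait in set(count_I) | set(count_II):
        if count_I[trait] > alpha and count_II[trait] < beta:
            features_I.add(trait)      # frequent in Class I, rare in Class II
        elif count_I[trait] < alpha and count_II[trait] > beta:
            features_II.add(trait)     # rare in Class I, frequent in Class II
    return features_I, features_II
```

Traits that satisfy neither condition are simply not used as features in this sketch.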
K. Rebound alarm index



The final piece in our methodology is to define a rebound alarm index that will be used in 
the forward testing to 'predict' rebounds. Two types of rebound alarm index are developed. 
One is for the back tests before 1975-01-01, as we have already used the information before 
this time to generate informative parameters and features. The other alarm index is for the 
prediction tests. We generate this prediction rebound alarm index using only the information 
before a certain time and then try to predict rebounds in the 'future' beyond that time. 

IV. BACK TESTING 

A. Features of learning set 

Recall that a feature is a trait which frequently appears in one class but rarely in the
other class. Features are associated with feature qualification pairs $(\alpha, \beta)$. Using all the fits
from subset $S_{feature}$ found in Sec. III J, we generate the questionnaires for each day in the
learning set, i.e., the fits with $t_c$ before 1975-01-01. Take all traits from the questionnaire
$A_t$ for a particular day $t$ and compare them with features $F_I$ and $F_{II}$. The numbers of traits
in $F_I$ and $F_{II}$ are called $\nu_{t,I}$ and $\nu_{t,II}$. Then we define:

$$RI_t = \begin{cases} \dfrac{\nu_{t,I}}{\nu_{t,I} + \nu_{t,II}} & \text{if } \nu_{t,I} + \nu_{t,II} > 0 \\ 0 & \text{if } \nu_{t,I} + \nu_{t,II} = 0 \end{cases}$$ (29)

From the definition, we can see that $RI_t \in [0, 1]$. If $RI_t$ is high, then we expect that this
day has a high probability of being the start of a rebound.

We choose feature qualification pair (10, 200) here, meaning that a certain trait must
appear in trait Class I at least 11 times and must appear in trait Class II less than 200
times. If so, then we say that this trait is a feature of Class I. If, on the other hand, the
trait appears 10 times or less in Class I or appears 200 times or more in Class II, then this
trait is a feature of Class II. The result of this feature qualification is shown in Figure 3.
Note that the choice (10, 200) is somewhat arbitrary and does not constitute an in-sample
optimization on our part. This can be checked from the error diagrams presented below,
which scan these numbers: one can observe in particular that the pair (10, 200) does not give
the best performance. We have also investigated the impact of changing other parameters
and find a strong robustness.
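
A minimal sketch of the rebound alarm index for one day is given below. It assumes that Eq. (29) is read as the fraction of the day's matched traits that are Class I features, with the value 0 when no trait matches any feature; the function name `rebound_alarm_index` is hypothetical.

```python
def rebound_alarm_index(day_traits, features_I, features_II):
    """Fraction of matched traits that are Class I features (0 if none match)."""
    nu_I = sum(t in features_I for t in day_traits)    # traits found in F_I
    nu_II = sum(t in features_II for t in day_traits)  # traits found in F_II
    total = nu_I + nu_II
    return nu_I / total if total > 0 else 0.0
```

By construction the value lies in $[0, 1]$, and it approaches 1 only when nearly all matched traits of the day are Class I features.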

With this feature qualification, the rebound alarm index can distinguish rebounds with
high significance. If the first number $\alpha$ is too big and the second number $\beta$ is too small, then
the total number of Class I features will be very small and the number of features in Class II
will be large. This makes the rebound alarm index always close to 0. In contrast, if $\alpha$ is too
small and $\beta$ is too large, the rebound alarm index will often be close to 1. Neither of these
cases, then, is qualified to be a good rebound alarm index to indicate the start of the next 
rebound. However, the absolute values of feature qualification pair are not very sensitive 
within a large range. Only the ratio plays an important role. Figures - El show that 
varying $\alpha$ and $\beta$ in the intervals $10 \le \alpha \le 20$ and $200 \le \beta \le 1000$ does not change the result
much. For the sake of conciseness, only the rebound alarm index of feature qualification 
pair (10, 200) is shown in this paper. 

B. Predictions 

Once we generate the Class I and II features of the learning set for values of $t_c$ before
$t_p$ (Jan. 1, 1975), we then use these features to generate the predictions on the data after $t_p$.
Recall that the windows that we fit are defined such that the end time $t_2$ increases by 50 days
from one window to the next. Also note that all predictions made on days between these 50
days will be the same because there is no new fit information between two consecutive window
end times.

Assume that we make a prediction at time $t$:

$t \in (t_2, t_2 + 50], \quad t > t_p$ (30)

Then the fits set $S_{t_2} = \{ f \mid t_{2,f} \le t_2 \}$ is made using the past information before prediction
day $t$. We use $S_{t_2}$ as the subset $S_{sub}$ mentioned in Sec. III H to generate the questionnaire
on day $t$ and the traits for this questionnaire. Comparing these traits with features $F_I$ and
$F_{II}$ allows us to generate a rebound alarm index $RI_t$ using the same method as described in
Sec. IV A.

Using this technique, the prediction day $t_2$ is scanned from 1975-01-01 until 2009-07-22
in steps of 50 days. We then construct the time series of the rebound alarm index over this
period and with this resolution of 50 days. The comparison of this rebound alarm index
with the historical financial index (Figure 4) shows a good correlation, but there are also
some false positive alarms (1977, 1998, 2006), as well as some false negative missed rebounds
(1990). Many false positive alarms, such as those in 1998 and 2006, are actually associated with
rebounds, but these rebounds have smaller amplitudes than our qualifying threshold targets.
Concerning the false negative (missed rebound) in 1990, the explanation is probably that
the historical prices preceding this rebound do not follow the JLS model specification.
Rebounds may result from several mechanisms and the JLS model only provides one of
them, arguably the most important. Overall, the predictability of the rebound alarm index
shown in Figure 4, as well as the relative cost of the two types of errors (false positives and
false negatives), can be quantified systematically, as explained in the following sections. The
major conclusion is that the rebound alarm index has a prediction skill much better than
luck, as quantified by error diagrams.



C. Error Diagram 

We have qualitatively seen that the feature qualifications method using back testing and 
forward prediction can generate a rebound alarm index that seems to detect and predict well 
observed rebounds in the S&P 500 index. We now quantify the quality of these predictions 
with the use of error diagrams 0, js^]- We create an error diagram for predictions after 
1975-01-01 with a certain feature qualification in the following way: 

1. Count the number of rebounds after 1975-01-01 as defined in Sec. III B and
expression (16). There are 9 rebounds.

2. Take the rebound alarm index time series (after 1975-01-01) and sort the set of all 
alarm index values in decreasing order. There are 12,600 points in this series and the 
sorting operation delivers a list of 12,600 index values, from the largest to the smallest 
one. 

3. The largest value of this sorted series defines the first threshold. 

4. Using this threshold, we declare that an alarm starts on the first day that the unsorted 
rebound alarm index time series exceeds this threshold. The duration of this alarm 
Da is set to 41 days, since the longest distance between a rebound and the day with 
index greater than the threshold is 20 days. Then, a prediction is deemed successful 
when a rebound falls inside that window of 41 days. 
5. If there are no successful predictions at this threshold, move the threshold down to
the next value in the sorted series of alarm index values.

6. Once a rebound is predicted with a new value of the threshold, count the ratio of 
unpredicted rebounds (unpredicted rebounds / total rebounds in set) and the ratio of 
alarms used (duration of alarm period / 12,600 prediction days). Mark this as a single 
point in the error diagram. 

In this way, we will mark 9 points in the error diagram for the 9 rebounds. 
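
The threshold-lowering procedure above can be sketched as follows. Several details are assumptions made for this illustration and are not stated in the text: days are integer indices, an alarm covers the 20 days before and after any day whose index is at or above the threshold (41 days in total), only distinct index values are scanned, and one point is recorded each time the number of predicted rebounds increases. The name `error_diagram` is hypothetical.

```python
def error_diagram(ri, rebounds, half_width=20):
    """Return (alarm fraction, missed-rebound fraction) points.

    ri: rebound alarm index per day; rebounds: day indices of true rebounds.
    """
    n = len(ri)
    points, predicted = [], 0
    for threshold in sorted(set(ri), reverse=True):
        alarm = [False] * n
        for day, value in enumerate(ri):
            if value >= threshold:
                for k in range(max(0, day - half_width),
                               min(n, day + half_width + 1)):
                    alarm[k] = True
        hits = sum(alarm[r] for r in rebounds)
        if hits > predicted:                 # a new rebound is now predicted
            predicted = hits
            points.append((sum(alarm) / n, 1 - hits / len(rebounds)))
    return points
```

Points lying below the anti-diagonal $y = 1 - x$ would indicate an improvement over random alarms, as discussed in the next paragraph.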

The aim of using such an error diagram in general is to show that a given prediction 
scheme performs better than random. A random prediction follows the line y = 1 — x in the 
error diagram. A set of points below this line indicates that the prediction is better than 
randomly choosing alarms. The prediction is seen to improve as more error diagram points 
are found near the origin (0, 0). The advantage of error diagrams is to avoid discussing how 
different observers would rate the quality of predictions in terms of the relative importance 
of avoiding the occurrence of false positive alarms and of false negative missed rebounds. 
By presenting the full error diagram, we thus sample all possible preferences and the unique 
criterion is that the error diagram curve be shown to be statistically significantly below the 
anti-diagonal y = 1 — x. 

In Figure 5, we show error diagrams for different feature qualification pairs $(\alpha, \beta)$. Note
the 9 points representing the 9 rebounds in the prediction set. We also plot the 11 points
of the error diagrams for the learning set in Figure 6.

As a different test of the quality of this pattern recognition procedure, we repeated the
entire process but with a rebound now defined as the minimum price within a window of
2 × 365 days instead of 2 × 200 days, as before. These results are shown in Figures 7 and 8.

D. Bayesian inference 

Given a value of the predictive rebound alarm index, we can also use the historical rebound 
alarm index combined with Bayesian inference to calculate the probability that this value 
of the rebound alarm index will actually be followed by a rebound. We use predictions near 
the end of November 2008 as an example. From Figure 4, we can see that there is a strong
rebound signal in that period. We determine if this is a true rebound signal by the following
method:

1. Find the highest rebound alarm index $L_v$ around the end of November 2008.

2. Calculate $D_{total}$, the number of days in the interval from 1975-01-01 until the end of
the prediction set, 2009-07-22.

3. Calculate $D_{L_v}$, the number of days which have a rebound alarm index greater than or
equal to $L_v$.

4. The probability that the rebound alarm index is higher than or equal to $L_v$ is estimated by

$P(RI \ge L_v) = \dfrac{D_{L_v}}{D_{total}}$ (31)

5. The probability of a day being near the bottom of a rebound is estimated as the
number of days near real rebounds over the total number of days in the predicting set:

$P(rebound) = \dfrac{D_{rw} N_{rebound}}{D_{total}}$, (32)

where $N_{rebound}$ is the number of rebounds we can detect after 1975-01-01 and $D_{rw}$ is
the rebound width, i.e. the number of days near the real rebound in which we can say
that this is a successful prediction. For example, if we say that the prediction is good
when the predicted rebound time and real rebound time are within 10 days of each
other, then the rebound width $D_{rw} = 10 \times 2 + 1 = 21$.

6. The probability that the neighborhood of a rebound has a rebound alarm index larger than
or equal to $L_v$ is estimated as

$P(RI \ge L_v \mid rebound) = \dfrac{N_0}{N_{rebound}}$ (33)

where $N_0$ is the number of rebounds for which

$\sup_{|d - rebound| \le 20} RI_d \ge L_v$. (34)

7. Given that the rebound alarm index is higher than or equal to $L_v$, the probability that the rebound
will happen in this period is given by Bayesian inference:

$P(rebound \mid RI \ge L_v) = \dfrac{P(rebound) \times P(RI \ge L_v \mid rebound)}{P(RI \ge L_v)}$ (35)
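
The seven steps can be condensed into a short sketch, given here under stated assumptions: days are integer indices into the alarm index series, the inequalities are taken as non-strict, and the function name `rebound_posterior` and its keyword arguments are hypothetical. Since the three probabilities are crude frequency estimates, the returned ratio is not guaranteed to stay below 1.

```python
def rebound_posterior(ri, rebounds, lv, rebound_half_width=10, search_half_width=20):
    """Estimate P(rebound | RI >= lv) from frequency counts, following Eqs. (31)-(35)."""
    d_total = len(ri)
    d_lv = sum(v >= lv for v in ri)
    if d_lv == 0 or not rebounds:
        return 0.0
    p_ri = d_lv / d_total                                    # Eq. (31)
    d_rw = 2 * rebound_half_width + 1                        # rebound width in days
    p_rebound = d_rw * len(rebounds) / d_total               # Eq. (32)
    n0 = sum(                                                # Eq. (34)
        max(ri[max(0, r - search_half_width): r + search_half_width + 1]) >= lv
        for r in rebounds
    )
    p_ri_given_rebound = n0 / len(rebounds)                  # Eq. (33)
    return p_rebound * p_ri_given_rebound / p_ri             # Eq. (35)
```

In practice the same function would be evaluated for each feature qualification pair and the results averaged, as described in the following paragraph.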
Averaging $P(rebound \mid RI \ge L_v)$ over all the different feature qualifications gives the
probability that the end of November 2008 is a rebound as 0.044. By comparing with observations,
we see that this period is not a rebound. We obtain a similar result by increasing the
definition of rebound from 200 days before and after a local minimum to 365 days, yielding a
probability of 0.060.

When we decrease the definition to 100 days, the probability that this period is a rebound
jumps to 0.597. The reason for this sudden jump is shown in Figure 9, where we see the
index around this period and the S&P 500 index value. From the figure, we find that this
period is a local minimum within 100 days, not more. This is consistent with what Bayesian
inference tells us. However, we must point out that the more obvious rebound in March
2009 is missed by our rebound alarm index. Technically, this is
because the end of the crash is not consistent with the beginning of the rebound in this special
period.

In this case, we then test all the days after 1985-01-01 systematically by Bayesian infer- 
ence using only prediction data (rebound alarm index) after 1975-01-01. To show that the 
probability that RI > Lv is stable, we cannot start Bayesian inference too close to the initial 
predictions so we choose 1985-01-01 as the beginning time. We have 5 'bottoms' (troughs) 
after this date, using the definition of a minimum within ±200 days. 

For a given day d after 1985-01-01, we know all values of the rebound alarm index from 
1975-01-01 to that day. Then we use this index and historical data of the asset price time 
series in this time range to calculate the probability that d is the bottom of the trough, 
given that the rebound alarm index is larger than Lv, where Lv is defined as 

$L_v = \sup_{d - t < 50} RI_t$ (36)

To simplify the test, we only consider the case of feature qualification pair (10, 200), 
meaning that the trait is a feature of Class I only if it shows in Class I more than 10 
times and in Class II less than 200 times. Figure [10] shows that the actual rebounds occur 
near the local highest probability of rebound calculated by Bayesian inference. This figure 
also illustrates the existence of false positive alarms, i.e., large peaks of the probability not 
associated with rebounds that we have characterized unambiguously at the time scale of 
±200 days. 
V. TRADING STRATEGY



In order to determine if the predictive power of our method provides a genuine and useful 
information gain, it is necessary to estimate the excess return it could generate. The excess 
return is the real return minus the risk free rate transformed from annualized to the duration 
of this period. The annualized 3-month US treasury bill rate is used as the risk free rate 
in this paper. We thus develop a trading strategy based on the rebound alarm index as 
follows. When the rebound alarm index rises higher than a threshold value Th, then with a 
lag of Os days, we buy the asset. This entry strategy is complemented by the following exit 
strategy. When the rebound alarm index goes below Th, we still hold the stock for another 
Hp days, with one exception. Consider the case that the rebound alarm index goes below
Th at time $t_1$ and then rises above Th again at time $t_2$. If $t_2 - t_1$ is smaller than the holding
period Hp, then we continue to hold the stock until the next time when the rebound alarm
index remains below Th for Hp days.
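
The entry and exit rules can be sketched as a single rule on the lagged signal. This is one possible reading, stated as an assumption: the position on day $d$ is open if the index was above Th on at least one day between $d - Os - Hp$ and $d - Os$, so that both entries and exits are shifted by the offset. The function name `positions` and the use of integer day indices are hypothetical.

```python
def positions(ri, th, offset, holding):
    """Return a list of booleans: True on the days when the asset is held.

    A day d is invested if the index exceeded th on some day in
    [d - offset - holding, d - offset] (assumed reading of the rules).
    """
    above = [v > th for v in ri]
    invested = [False] * len(ri)
    for d in range(len(ri)):
        end = d - offset
        start = max(0, end - holding)
        if end >= 0 and any(above[start:end + 1]):
            invested[d] = True
    return invested
```

With this rule, a dip below Th shorter than Hp days does not interrupt the position, which reproduces the exception described above.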

The performance of this strategy for some fixed values of the parameters is compared 
with random strategies, which share all the properties except for the timing of entries and 
exits determined by the rebound alarm index and the above rules. The random strategies 
consist in buying and selling at random times, with the constraint that the total holding 
period (sum of the holding days over all trades in a given strategy) is the same as in the 
realized strategy that we test. Implementing these constrained random strategies 1000 times
with different random number realizations provides the confidence intervals needed to assess whether
the performance of our strategy can be attributed to real skill or just to luck.

Results of this comparison are shown in Table III for two sets of parameter values. The
p-value is a measure of the strategies' performance, calculated as the fraction of corresponding
random strategies that are better than or equal to our strategies. The lower the p-value
is, the better the strategy is compared to the random portfolios. We see that all of our
strategies' cumulative excess returns are among the top 5-6% out of 1000 corresponding
random strategies' cumulative excess returns. Box plots for each of the strategies are also
presented in Figures 11 and 12.
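
The comparison with random strategies can be sketched as follows. The generator below is only one possible way to draw random, non-overlapping trades whose total holding period matches a given strategy; keeping the individual trade durations fixed is an assumption, and the names `random_strategy` and `p_value` are hypothetical.

```python
import random

def random_strategy(n_days, durations, rng):
    """Place trades of the given durations at random, without overlap.

    Returns (start, end) day pairs, end exclusive; total holding time is preserved.
    """
    free = n_days - sum(durations)                 # days spent out of the market
    cuts = sorted(rng.randint(0, free) for _ in durations)
    order = list(durations)
    rng.shuffle(order)
    trades, used = [], 0
    for cut, duration in zip(cuts, order):
        start = cut + used
        trades.append((start, start + duration))
        used += duration
    return trades

def p_value(own_result, random_results):
    """Fraction of random strategies that do at least as well as ours."""
    return sum(r >= own_result for r in random_results) / len(random_results)
```

One would evaluate the cumulative excess return of each of the 1000 random strategies and pass these values, together with the result of the tested strategy, to `p_value`.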

The cumulative returns as well as the cumulative excess returns obtained with the two
strategies as a function of time are shown in Figures 13 and 14. These results suggest that these
two strategies would provide significant positive excess return. Of course, the performance
obtained here is lower than that of the naive buy-and-hold strategy, consisting in buying at the
beginning of the period and just holding the position. The comparison with the buy-and-hold
strategy would however be unfair, as our strategy is quite seldom invested in the market.
Our goal here is not to do better than any other strategy but to determine the statistical
significance of a specific signal. For this, the correct method is to compare with random
strategies that are invested in the market the same fraction of time. It is obvious that we
could improve the performance of our strategy by combining the alarm indexes of bubbles
and of negative bubbles, for instance, but this is not the goal here.

We also provide the Sharpe ratio as a measure of the excess return (or risk premium) per
unit of risk. We define it per trade as follows

$S = \dfrac{E[R - R_f]}{\sigma}$ (37)

where $R$ is the return of a trade, $R_f$ is the risk free rate (we use the 3-month US treasury
bill rate) transformed from annualized to the duration of this trade given in Table III and
$\sigma$ is the standard deviation of the returns per trade. The higher the Sharpe ratio is, the
higher the excess return under the same risk.

The bias ratio is defined as the number of trades with a positive return within one
standard deviation divided by one plus the number of trades which have a negative return
within one standard deviation:

$BR = \dfrac{\#\{ r \mid r \in [0, \sigma] \}}{1 + \#\{ r \mid r \in [-\sigma, 0) \}}$ (38)

In Eq. (38), $r$ is the excess return of a trade and $\sigma$ is the standard deviation of the excess
returns. This ratio detects valuation bias.
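
Both ratios can be computed per trade with a few lines. The sketch below uses the population standard deviation, which is an assumption (the text does not say whether the sample or population estimator is used), and the function names are hypothetical.

```python
from statistics import mean, pstdev

def sharpe_per_trade(returns, risk_free):
    """Mean excess return divided by the standard deviation of trade returns."""
    excess = [r - rf for r, rf in zip(returns, risk_free)]
    return mean(excess) / pstdev(returns)

def bias_ratio(excess_returns):
    """#{r in [0, sigma]} / (1 + #{r in [-sigma, 0)}) for per-trade excess returns."""
    sigma = pstdev(excess_returns)
    positive = sum(0 <= r <= sigma for r in excess_returns)
    negative = sum(-sigma <= r < 0 for r in excess_returns)
    return positive / (1 + negative)
```

Here `risk_free` is expected to hold, for each trade, the risk free rate already converted from its annualized value to the duration of that trade.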

To see the performance of our strategies, we also check all the possible random trades
with a holding period equal to the average duration of our strategies, namely 25 days and 17
days for strategy I and II respectively. The average Sharpe and bias ratios of these random
trades are shown in Table III. Both Sharpe and bias ratios of our strategies are greater than
those of the random trades, confirming that our strategies deliver a larger excess return with
a stronger asymmetry towards positive versus negative returns.

As another test, we select randomly the same number of random trades as in our
strategies, making sure that there is no overlap between the selected trades. We calculate the
Sharpe and bias ratios for these random trades. Repeating this random comparative
selection 1000 times provides us with p-values for the Sharpe ratio and for the bias ratio of our
strategies. The results are presented in Table III. All the p-values are found to be quite small,
confirming that our strategies perform well.



VI. CONCLUSION 

We have developed a systematic method to detect rebounds in financial markets using 
"negative bubbles," defined as the mirror image of bubbles with respect to a horizontal line,
i.e., downward accelerated price drops. The aggregation of thousands of calibrations in run- 
ning windows of the negative bubble model on financial data has been performed using a 
general pattern recognition method, leading to the calculation of a rebound alarm index. 
Performance metrics have been presented in the form of error diagrams, of Bayesian infer- 
ence to determine the probability of rebounds and of trading strategies derived from the 
rebound alarm index dynamics. These different measures suggest that the rebound alarm
index provides genuine information and has predictive ability. The implemented trading
strategies outperform randomly chosen portfolios constructed with the same statistical char- 
acteristics. This suggests that financial markets may be characterized by transient positive 
feedbacks leading to accelerated drawdowns, which develop similarly to but as mirror images 
of upward accelerating bubbles. Our key result is that these negative bubbles have been 
shown to be predictably associated with large rebounds or rallies. 

In summary, we have expanded the evidence for the possibility to diagnose bubbles before
they terminate [47], by adding the phenomenology and modeling of "negative bubbles" and
their anticipatory relationship with rebounds. The present paper contributes to improving 
our understanding of the most dramatic anomalies exhibited by financial markets in the form 
of extraordinary deviations from fundamental prices (both upward and downward) and of 
extreme crashes and rallies. Our results suggest a common underlying origin to both positive 
and negative bubbles in the form of transient positive feedbacks leading to identifiable and 
reproducible faster-than-exponential price signatures. 
VII. LIST OF SYMBOLS

TABLE I: List of symbols

h(t)               hazard rate
p(t)               stock price
A, B, C            linear parameters of the JLS model
t_c                critical time in the JLS model at which the bubble ends
m                  exponent parameter in the JLS model
\omega             frequency parameter in the JLS model
\phi               phase parameter in the JLS model
b                  parameter controlling the positivity of the hazard rate in the JLS model
Rbd                rebound time
C_I                set of Class I fits
C_II               set of Class II fits
G_i                set of Group i fits
IP                 informative parameter
A                  questionnaire
T                  trait
(\alpha, \beta)    feature qualification pair
RI                 rebound alarm index
Lv                 highest rebound alarm index around a certain time
Th                 threshold value for the trading strategy
Os                 offset for the trading strategy
Hp                 holding period for the trading strategy
S                  Sharpe ratio
R_f                risk free rate
r, R - R_f         excess return of a trade
BR                 bias ratio
#                  number of elements of a set
TABLE II: Traits for series A = (0,1,-1,-1)
TABLE III: Performances of two strategies: Strategy I (Th = 0.2, Os = 10, Hp = 10) and Strategy
II (Th = 0.7, Os = 30, Hp = 10).

                                                     Strategy I     Strategy II

Threshold Th                                         0.2            0.7
Offset Os                                            10             30
Holding period Hp                                    10             10
Number of trades                                     77             38
Success rate
  (fraction of trades with positive return)          66.2%          65.8%
Total holding days                                   1894 days      656 days
Fraction of time when invested                       15.0%          5.2%
Cumulated log-return                                 95%            45%
Cumulated excess log-return                          67%            35%
Average return per trade                             1.23%          1.19%
Average trade duration                               24.60 days     17.26 days
p-value of cumulative excess return                  0.055          0.058
Sharpe ratio per trade                               0.247          0.359
Sharpe ratio of random trades
  (holding period equals average trade duration)     0.025          0.021
p-value of Sharpe ratio                              0.043          0.036
Bias ratio                                           1.70           1.36
Bias ratio of random trades
  (holding period equals average trade duration)     1.27           1.25
p-value of bias ratio                                0.105          0.309
FIG. 1: Upper panel: Significant drawdown of nearly 50% from 1973-01-01 to 1974-10-01 (time
window delineated by the two black dashed vertical lines) with very clear log-periodic oscillations,
followed by a strong positive rebound. The best fits from taboo search are used to form a 90% 



[Panel title: Histogram of rebound time t_c for the S&P 500, with t_c in [t_2, t_2 + 0.375 (t_2 − t_1)].
Window size max: 1500, min: 110. Bins: 1500. b < 0, B > 0; 2568 fits.]
FIG. 2: (upper) Histogram of the critical times t_c over the set of 2,568 time intervals for which
negative bubbles are detected, i.e., for which the fits of ln p(t) by the JLS expression satisfy
the negative-bubble condition. (lower) Plot of −ln p(t) versus time for the S&P 500 index. Note
that peaks in this figure correspond to valleys in the actual price.
FIG. 3: Rebound alarm index and log-price of the S&P 500 Index for the learning set, where t_2 and
t_c are both before Jan. 1, 1975. (upper) Rebound alarm index for the learning set using feature
qualification pair (10, 200). The rebound alarm index is in the range [0, 1]. The higher the rebound
alarm index, the more likely is the occurrence of a rebound. (lower) Plot of ln p(t) versus time for
the S&P 500 Index. Red vertical lines indicate rebounds, defined as local minima within plus and
minus 200 days around them. Note that these rebounds are historical "changes of regime" rather than
only jump-like reversals. The jump-like reversals, for example those of 1972 and 1974, are included
among these rebounds. They are located near clusters of high values of the rebound alarm index in
the upper panel.
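The rebound definition used in this caption (a local minimum of the price within plus and minus 200 days) can be sketched as follows. This is a minimal illustration, assuming a daily price series stored in an array and assuming that days closer than 200 days to either end of the series are not tested; the function name `find_rebounds` and the toy series are hypothetical.

```python
import numpy as np

def find_rebounds(prices, half_window=200):
    """Return the indices of days whose price is the minimum over
    [day - half_window, day + half_window] (first occurrence in case of ties)."""
    prices = np.asarray(prices, dtype=float)
    rebounds = []
    for day in range(half_window, len(prices) - half_window):
        window = prices[day - half_window : day + half_window + 1]
        # The centre of the window has position half_window inside the slice.
        if np.argmin(window) == half_window:
            rebounds.append(day)
    return rebounds

# Toy V-shaped series with a single bottom at index 4 (not market data).
print(find_rebounds([5, 4, 3, 2, 1, 2, 3, 4, 5], half_window=2))
```

Setting `half_window=365` gives the alternative rebound definition used in Figures 7 and 8.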
[Panel title: S&P 500 rebound alarm index, using information before Jan. 1, 1975 to predict rebounds until Jun. 3, 2009.]
FIG. 4: Rebound alarm index and log-price of the S&P 500 Index for the predicting set after Jan. 1,
1975. (upper) Rebound alarm index for the predicting set using feature qualification pair (10, 200).
The rebound alarm index is in the range [0, 1]. The higher the rebound alarm index, the more
likely is the occurrence of a rebound. (lower) Plot of ln p(t) versus time for the S&P 500 Index. Red
vertical lines indicate rebounds, defined as local minima within plus and minus 200 days. They
are located near clusters of high values of the rebound alarm index in the upper panel.
FIG. 5: Error diagram for predictions after Jan. 1, 1975 with different types of feature
qualifications. Feature qualification (α, β) means that, if the occurrence of a certain trait in Class I is larger
than α and less than β, then we call this trait a feature of Class I and vice versa. See text for more
information.
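How the points of such an error diagram can be obtained is sketched below. The sketch assumes that an alarm is declared on every day whose rebound alarm index is at or above a threshold, that a rebound counts as predicted when it falls on an alarm day, and that at least one rebound is supplied. These conventions, the function name `error_diagram` and the toy inputs are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def error_diagram(alarm_index, rebound_days, thresholds):
    """For each threshold, return (alarm period / total period, fraction of missed rebounds)."""
    alarm_index = np.asarray(alarm_index, dtype=float)
    rebound_days = np.asarray(rebound_days, dtype=int)
    points = []
    for threshold in thresholds:
        alarm_on = alarm_index >= threshold              # days on which an alarm is declared
        alarm_fraction = alarm_on.mean()                 # horizontal axis of the diagram
        missed_fraction = 1.0 - alarm_on[rebound_days].mean()  # vertical axis
        points.append((float(alarm_fraction), float(missed_fraction)))
    return points

# Toy alarm index over four days with rebounds on days 2 and 3 (not data from the paper).
print(error_diagram([0.0, 0.5, 1.0, 0.2], [2, 3], [0.0, 0.5, 1.1]))
```

A predictor without skill stays close to the diagonal joining (0, 1) and (1, 0); points well below that diagonal indicate that rebounds are caught with less alarm time than random guessing would need.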
[Panel title: Error Diagram for S&P 500 back tests. Axis label: Alarm period / Total period.]

FIG. 6: Same as Figure 5 but for the learning set before Jan. 1, 1975.

[Panel title: Error Diagram for S&P 500 predictions.]
FIG. 7: Same as Figure 5 but with a different definition of a rebound, determined as the day with
the smallest price within the 365 days before it and the 365 days after it.
[Panel title: Error Diagram for S&P 500 back tests.]
FIG. 8: Same as Figure 7 but for the learning set before Jan. 1, 1975.
[Upper panel title: S&P 500 rebound alarm index near November 2008. Probability of rebound given the rebound alarm index is calculated by Bayesian inference.]
[Lower panel title: S&P 500 index near November 2008. Red vertical lines are the local minima of this period. 109 days between two local minima.]
FIG. 9: Rebound alarm index and market price near and after November 2008.
[Panel title: Probability of rebound obtained using Bayesian inference. Rebound is bottom of [-200, 200] days. Red dashed lines represent real rebounds.]

FIG. 10: Probability of rebound as a function of time t, given the value of the rebound alarm index
at t, derived by Bayesian inference applied to bottoms at the time scale of ±200 days. The feature
qualification is (10, 200). Lv is the largest rebound alarm index in the past 50 days. The vertical red
lines show the locations of the realized rebounds in the history of the S&P 500 index.
[Panel title: Box plot for strategy with holding period 10 days, threshold 0.2 and offset 10. 1000 repeats. Red circle is the performance of our strategy.]

FIG. 11: Box plot for Strategy I (Th = 0.2, Os = 10, Hp = 10). The lower and upper horizontal
edges (blue lines) of the box represent the first and third quartiles. The red line in the middle is the
median. The lower and upper black lines lie 1.5 interquartile ranges away from the quartiles. Points
between the quartiles and the black lines are outliers and points beyond the black lines are extreme
outliers. The return of our strategy is marked by the red circle. This shows that our strategy is an
outlier among the set of random strategies. Its log-return ranked 55 out of 1000 random strategies.
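The comparison against random strategies can be sketched as a small Monte Carlo experiment. The sketch assumes that each random strategy makes the same number of trades as the tested strategy, enters on uniformly drawn days (overlapping trades allowed), holds each trade for a fixed number of days, and that the p-value is the fraction of random strategies whose cumulated log-return is at least that of the tested strategy. These conventions and all function names are assumptions made for illustration, not the exact procedure behind the figure.

```python
import numpy as np

def random_strategy_returns(log_prices, n_trades, holding_days, n_repeats=1000, seed=0):
    """Cumulated log-returns of n_repeats random strategies with n_trades trades each."""
    rng = np.random.default_rng(seed)
    log_prices = np.asarray(log_prices, dtype=float)
    last_entry = len(log_prices) - holding_days   # exclusive upper bound for entry days
    totals = np.empty(n_repeats)
    for k in range(n_repeats):
        entries = rng.integers(0, last_entry, size=n_trades)
        totals[k] = np.sum(log_prices[entries + holding_days] - log_prices[entries])
    return totals

def empirical_p_value(strategy_return, random_returns):
    """Fraction of random strategies that do at least as well as the tested strategy."""
    random_returns = np.asarray(random_returns, dtype=float)
    return float(np.mean(random_returns >= strategy_return))

# Toy example with a synthetic random-walk log-price (not market data).
toy_log_prices = np.cumsum(np.random.default_rng(1).normal(0.0, 0.01, size=2000))
toy_random = random_strategy_returns(toy_log_prices, n_trades=20, holding_days=10)
print(empirical_p_value(0.5, toy_random))
```

Under this convention, a strategy that is matched or beaten by 55 of 1000 random strategies has an empirical p-value of 0.055.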
[Panel title: 1000 repeats. Red circle is the performance of our strategy.]

FIG. 12: Same as Figure 11 for Strategy II (Th = 0.7, Os = 30, Hp = 10). The log-return ranked
58 out of 1000 random strategies.
[Panel title: Wealth trajectory for strategy: threshold: 0.2, offset: 10 days, holding period: 10 days. Cumulated log-return (excess log-return): 94.84% (66.99%), 77 trades in total.]
FIG. 13: Wealth trajectory for Strategy I (Th = 0.2, Os = 10, Hp = 10). Major performance
parameters of this strategy are: 77 trades; 66.2% of trades have a positive return; 1894 total
holding days, which is 15.0% of the total time. The accumulated log-return is 95% and the average
return per trade is 1.23%. The average trade length is 24.60 days. The p-value of this strategy is 0.055.
[Panel title: Wealth trajectory for strategy: threshold: 0.7, offset: 30 days, holding period: 10 days. Cumulated log-return (excess log-return): 45.03% (34.97%), 38 trades in total.]
FIG. 14: Wealth trajectory for Strategy II (Th = 0.7, Os = 30, Hp = 10). Major performance
parameters of this strategy are: 38 trades; 65.8% of trades have a positive return; 656 total
holding days, which is 5.2% of the total time. The accumulated log-return is 45% and the average
return per trade is 1.19%. The average trade length is 17.26 days. The p-value of this strategy is 0.058.






